QMIR 2026

Week 1: Course Roadmap & Research Workflow

Tristan Muno
February 9, 2026

Agenda

  1. Introduction
    • About me
    • About you
  2. Course overview and expectations
    • What you will learn
    • What this course does not cover
    • Why the course is structured this way
  3. Getting started
    • Installation
    • First look at Positron

Introduction

Survey

Please fill out the following survey:

dertristan.limesurvey.net/qmir-2026-welcome

About me

About you

  • What name would you like us to call you?
  • What natural (spoken) languages do you speak or are you currently learning? (e.g., German, Spanish, Arabic – not programming languages)
  • How do you usually spend your time outside university? (This can be anything: a side job, vocational/civilian service, volunteering, sports, creative hobbies, gaming, family time – whatever is part of your life)

About you continued

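# tidyverse supplies ggplot2, forcats, dplyr, tidyr, and stringr
library(tidyverse)
library(ggpubr) # theme_pubr()

# `responses` is assumed to be the welcome-survey export, loaded beforehand (not shown)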

color_main <- "#003056"
color_highlight <- "#DE7E50"

ggplot(
  data = responses,
  mapping = aes(y = fct_rev(fct_infreq(Q806045)))
) +
  geom_bar(
    color = color_highlight,
    fill = "white",
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Count",
    y = "Answer"
  ) +
  theme(aspect.ratio = 0.618)

Figure 1: Your answers to: What is your primary field of study (major/Hauptfach)?
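
Why the plot wraps the variable in fct_rev(fct_infreq(...)): a minimal sketch with made-up values (the `field` vector is hypothetical, not survey data). fct_infreq() orders factor levels by frequency and fct_rev() reverses that order. Since ggplot2 draws the first level at the bottom of a y axis, the reversal should put the most frequent answer at the top.

```r
library(forcats)

# Hypothetical answers, for illustration only (not the survey data)
field <- factor(c(
  "Politics", "Sociology", "Politics",
  "Economics", "Politics", "Sociology"
))

# Levels ordered by how often they occur, most frequent first
by_freq <- fct_infreq(field)
levels(by_freq)

# Same levels in reverse order, least frequent first
by_freq_rev <- fct_rev(by_freq)
levels(by_freq_rev)
```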

ggplot(
  data = responses,
  mapping = aes(y = fct_rev(fct_infreq(Q933620)))
) +
  geom_bar(
    color = color_highlight,
    fill = "white",
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Count",
    y = "Answer"
  ) +
  theme(aspect.ratio = 0.618)

Figure 2: Your answers to: What is your secondary field of study (minor/Nebenfach)?

ggplot(
  data = responses,
  mapping = aes(x = fct_infreq(Q897542))
) +
  geom_bar(
    color = color_highlight,
    fill = "white",
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Answer",
    y = "Count"
  ) +
  theme(aspect.ratio = 0.618)

Figure 3: Your answers to: Which semester of your Bachelor’s program are you currently in?

ggplot(
  data = responses,
  mapping = aes(x = fct_infreq(Q507207))
) +
  geom_bar(
    color = color_highlight,
    fill = "white",
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Answer",
    y = "Count"
  ) +
  theme(aspect.ratio = 0.618)

Figure 4: Your answers to: Have you taken any quantitative methods or statistics course before?

ggplot(
  data = responses,
  mapping = aes(y = fct_infreq(Q609569))
) +
  geom_bar(
    color = color_highlight,
    fill = "white",
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Count",
    y = "Answer"
  ) +
  theme(aspect.ratio = 0.618)

Figure 5: Your answers to: Why did you register for this course specifically?

responses |>
  select(Q2610) |>
  mutate(
    Q2610_re = str_replace_all(Q2610, "&nbsp;", "")
  ) |>
  ggplot(
    mapping = aes(x = fct_infreq(Q2610_re))
  ) +
  geom_bar(
    fill = "white",
    color = color_highlight,
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    y = "Count",
    x = "Answer"
  ) +
  theme(aspect.ratio = 0.618)

Figure 6: Your answers to: Which statement best describes how you feel about quantitative methods right now?

# Create a named vector for renaming
tool_names <- c(
  "SQ001." = "R",
  "SQ002." = "RStudio",
  "SQ004." = "R Markdown",
  "SQ005." = "Git",
  "SQ006." = "GitHub",
  "SQ007." = "Python",
  "SQ008." = "Stata",
  "SQ009." = "SPSS",
  "SQ010." = "MS Excel",
  "SQ011." = "LaTeX",
  "SQ012." = "Command line/terminal",
  "SQ013." = "Positron",
  "SQ014." = "Markdown",
  "SQ015." = "Quarto",
  "SQ016." = "Overleaf",
  "SQ017." = "MS Word",
  "SQ018." = "VS Code"
)


tools_heardof_long <- responses |>
  select(contains("Q509412")) |>
  # rename columns by replacing SQ### with tool names
  rename_with(~ str_replace_all(., tool_names), .cols = everything()) |>
  # pivot longer for ggplot
  pivot_longer(
    cols = everything(),
    names_to = "tool",
    values_to = "answer"
  ) |>
  mutate(tool = str_replace(tool, "Q509412.", ""))

max_count <- tools_heardof_long |>
  group_by(tool) |>
  count(answer, sort = TRUE) |>
  ungroup() |>
  slice(1) |>
  pull(n)


ggplot(
  data = tools_heardof_long,
  mapping = aes(y = fct_infreq(answer))
) +
  geom_bar(
    fill = "white",
    color = color_highlight,
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Count",
    y = "Answer"
  ) +
  theme(aspect.ratio = 0.618) +
  scale_x_continuous(
    breaks = seq(from = 0, to = max_count, by = 1)
  ) +
  facet_wrap(~tool)

Figure 7: Your answers to: Which of the following tools have you heard of before?
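
The renaming step above relies on passing a named vector to str_replace_all(): each name is treated as a regular-expression pattern and each value as its replacement. A minimal sketch on a made-up two-column table follows. The column names, the answers, and the `toy_*` objects are assumptions for illustration, and the dots are escaped here so that they match a literal dot.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Hypothetical export with two sub-question columns (not the survey data)
toy <- tibble(
  `Q509412.SQ001.` = c("Yes", "No", "Yes"),
  `Q509412.SQ005.` = c("No", "No", "Yes")
)

# Names are regex patterns, values are replacements
toy_names <- c("SQ001\\." = "R", "SQ005\\." = "Git")

toy_long <- toy |>
  # swap the sub-question codes for readable tool names
  rename_with(~ str_replace_all(.x, toy_names)) |>
  # one row per respondent and tool
  pivot_longer(
    cols = everything(),
    names_to = "tool",
    values_to = "answer"
  ) |>
  # drop the question prefix so only the tool name remains
  mutate(tool = str_remove(tool, "^Q509412\\."))

count(toy_long, tool, answer)
```

Escaping the dot is a design choice in this sketch: an unescaped `.` matches any character, which also works for these column names but is less strict.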

tools_used_long <- responses |>
  select(contains("Q928658")) |>
  # rename columns by replacing SQ### with tool names
  rename_with(~ str_replace_all(., tool_names), .cols = everything()) |>
  # pivot longer for ggplot
  pivot_longer(
    cols = everything(),
    names_to = "tool",
    values_to = "answer"
  ) |>
  mutate(tool = str_replace(tool, "Q928658.", ""))

max_count <- tools_used_long |>
  group_by(tool) |>
  count(answer, sort = TRUE) |>
  ungroup() |>
  slice(1) |>
  pull(n)


ggplot(
  data = tools_used_long,
  mapping = aes(y = fct_infreq(answer))
) +
  geom_bar(
    fill = "white",
    color = color_highlight,
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Count",
    y = "Answer"
  ) +
  theme(aspect.ratio = 0.618) +
  scale_x_continuous(
    breaks = seq(from = 0, to = max_count, by = 1)
  ) +
  facet_wrap(~tool)

Figure 8: Your answers to: Which of the following tools have you actively worked with before?

ggplot(
  data = responses,
  mapping = aes(y = fct_infreq(Q278073))
) +
  geom_bar(
    fill = "white",
    color = color_highlight,
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    x = "Count",
    y = "Answer"
  ) +
  theme(aspect.ratio = 0.618)

Figure 9: Your answers to: How would you describe your overall confidence in working with code?

items_map <- c(
  "Q751740" = "Statistical analysis is mainly about determining whether a hypothesis is true or false.",
  "Q91718" = "Empirical data can increase or decrease our confidence in a theoretical claim.",
  "Q797817" = "Uncertainty should be explicitly reported when presenting empirical results.",
  "Q235999" = "Scientific knowledge is always provisional and subject to revision with new data.",
  "Q402631" = "A single statistical result can definitely prove or disprove a theory.",
  "Q613049" = "Models are simplifications of reality rather than exact representations."
)

responses |>
  select(all_of(c(
    "Q751740",
    "Q91718",
    "Q797817",
    "Q235999",
    "Q402631",
    "Q613049"
  ))) |>
  pivot_longer(
    cols = everything(),
    names_to = "question",
    values_to = "response"
  ) |>
  mutate(
    question = items_map[question],
    question = str_wrap(question, width = 40)
  ) |>
  ggplot(
    mapping = aes(x = response)
  ) +
  geom_bar(
    fill = "white",
    color = color_highlight,
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    y = "Count",
    x = "Answer"
  ) +
  scale_x_continuous(
    breaks = 1:5,
    labels = c(
      "Strongly\ndisagree",
      "Disagree",
      "Neither\nagree nor\ndisagree",
      "Agree",
      "Strongly\nagree"
    )
  ) +
  theme(
    aspect.ratio = 0.618,
    axis.text.x = element_text(size = 8)
  ) +
  facet_wrap(~question)

Figure 10: Your answers in the agreement/disagreement items I.

items_map <- c(
  "Q859946" = "Whether it is statistically significant.",
  "Q65120" = "The size of the effect.",
  "Q157010" = "The uncertainty around the estimate.",
  "Q233596" = "Whether the result fits the theory.",
  "Q699812" = "Whether the model assumptions seem plausible.",
  "Q975313" = "I am usually unsure how to interpret such results."
)

responses |>
  select(all_of(names(items_map))) |>
  pivot_longer(
    cols = everything(),
    names_to = "question",
    values_to = "response"
  ) |>
  mutate(
    question = items_map[question],
    question = str_wrap(question, width = 40)
  ) |>
  ggplot(
    mapping = aes(x = response)
  ) +
  geom_bar(
    fill = "white",
    color = color_highlight,
    width = 0.5
  ) +
  theme_pubr() +
  labs(
    y = "Count",
    x = "Answer"
  ) +
  scale_x_continuous(
    breaks = 1:5,
    labels = c(
      "Strongly\ndisagree",
      "Disagree",
      "Neither\nagree nor\ndisagree",
      "Agree",
      "Strongly\nagree"
    )
  ) +
  theme(
    aspect.ratio = 0.618,
    axis.text.x = element_text(size = 8)
  ) +
  facet_wrap(~question)

Figure 11: Your answers in the agreement/disagreement items II: Imagine you read a study claiming that attending a political debate doubles the likelihood of a person voting in the next election. When you see a numerical result like this (e.g., an estimate or coefficient), what are you usually most interested in?

Course overview and expectations

Course website

What you will learn

  • How to build a reproducible, open-source research workflow – integrating data, code, analysis, and writing into a transparent scientific process
  • How to use statistical models to learn from data – understanding what models assume, what they estimate, how they update our beliefs, and how to interpret results responsibly
  • How to communicate empirical results clearly and responsibly – through well-structured reports, visualizations, and reproducible documents

What this course does not cover

  • Data collection and research design – we work with existing datasets and do not cover survey design, measurement strategies, or fieldwork (the university offers dedicated courses and resources in these areas)
  • Causal inference methods – we focus on analyzing and interpreting associations between variables rather than formal causal identification (specialized courses cover causal inference in depth)
  • Substantive political theory or normative argumentation – we use real political science examples, but the focus is methodological and conceptual training rather than developing theoretical frameworks

Why the course is structured this way

  • A (toy) example of a fully integrated, reproducible research pipeline in action: Quarto manuscript
  • We will build up to this step by step, learning and integrating each piece along the way

  • This setup is admittedly overkill for simple tasks, but it provides a scalable toolbox of free and open-source software (FOSS) for complex projects, advanced models, and fully reproducible workflows
  • It also builds a foundation in quantitative methods and data science that is useful across disciplines and careers, helping you understand research from other fields while gaining essential data skills

Questions?

Getting started

Installation

Installation guide

First look at Positron

Positron documentation

Thank you for your attention and see you next week!

Please make sure to install all software and send me your GitHub username or email address.